Input file Fbh32142FL.seq; Output File 32142.trans
Sequence length 2660 

M A A T 4 

CCTTTNTNRCCACGCGTCCGAGAGCGCCCCGCAGTCTTCGCGGAAAGCGTTCGGGGTAGGCG ATG GCT GCG ACG 12 

RAGPRAREIFTSLEYGPVPE 24

CGT GCA GGG CCC CGC GCC CGC GAG ATC TTC ACC TCG CTG GAG TAC GGA CCG GTG CCG GAG 72 

SHACALAWLDTQDRCLGHYV 44 

AGC CAC GCA TGC GCA CTG GCC TGG CTG GAC ACC CAG GAC CGG TGC TTG GGC CAC TAT GTG 132 

NGKWLKPEHRNSVPCQDPIT 64

AAT GGG AAG TGG TTA AAG CCT GAA CAC AGA AAT TCA GTG CCT TGC CAG GAT CCC ATC ACA 192 

GENLASCLQAQAEDVAAAVE 84 

GGA GAG AAC TTG GCC AGT TGC CTG CAG GCA CAG GCC GAG GAT GTG GCT GCA GCC GTG GAG 252 

AARMAFKGWSAHPGVVRAQH 104

GCA GCC AGG ATG GCA TTT AAG GGC TGG AGT GCG CAC CCC GGC GTC GTC CGG GCC CAG CAC 312 

LTRLAEVIQKHQRLLWTLES 124 

CTG ACC AGG CTG GCC GAG GTG ATC CAG AAG CAC CAG CGG CTG CTG TGG ACC CTG GAA TCC 372 

LVTGRAVREVRDGDVQLAQQ 144

CTG GTG ACT GGG CGG GCT GTT CGA GAG GTT CGA GAC GGG GAC GTC CAG CTG GCC CAG CAG 432 

LLHYHAIQASTQEEALAGWE 164 

CTG CTC CAC TAC CAT GCA ATC CAG GCA TCC ACC CAG GAG GAG GCA CTG GCA GGC TGG GAG 492 

PMGVIGLILPPTFSFLEMMW 184

CCC ATG GGA GTA ATT GGC CTC ATC CTG CCA CCC ACA TTC TCC TTC CTT GAG ATG ATG TGG 552

RICPALAVGCTVVALVPPAS 204

AGG ATT TGC CCT GCC CTG GCT GTG GGC TGC ACC GTG GTG GCC CTC GTG CCC CCG GCC TCC 612 

PAPLLLAQLAGELGPFPGIL 224 

CCG GCG CCC CTC CTC CTG GCC CAG CTG GCG GGG GAG CTG GGC CCC TTC CCG GGA ATC CTG 672 

NVVSGPASLVPILASQPGIR 244 

AAT GTC GTC AGT GGC CCT GCG TCC CTG GTG CCC ATC CTG GCC TCC CAG CCT GGA ATC CGG 732 

KVAFCGAPEEGRALRRSLAG 264

AAG GTG GCC TTC TGC GGA GCC CCG GAG GAA GGG CGT GCC CTT CGA CGG AGC CTG GCG GGA 792 

ECAELGLALGTESLLLLTDT 284

GAG TGT GCG GAG CTG GGC CTG GCG CTG GGG ACG GAG TCG CTG CTG CTG CTG ACG GAC ACG 852

ADVDSAVEGVVDAAWSDRGP 304

GCG GAC GTA GAC TCG GCC GTG GAG GGT GTC GTG GAC GCC GCC TGG TCC GAC CGC GGC CCG 912 

GGLRLLIQESVWDEAMRRLQ 324

GGT GGC CTC AGG CTC CTC ATC CAG GAG TCT GTG TGG GAT GAA GCC ATG AGA CGG CTG CAG 972 

ERMGRLRSGRGLDGAVDMGA 344 

GAG CGG ATG GGG CGG CTT CGG AGT GGC CGA GGG CTG GAT GGG GCC GTG GAC ATG GGG GCC 1032 

RGAAACDLVQRFVREAQSQG 364

CGG GGG GCT GCC GCA TGT GAC CTG GTC CAG CGC TTT GTG CGT GAG GCC CAG AGC CAG GGT 1092 

AQVFQAGDVPSERPFYPPTL 384 

GCA CAG GTG TTC CAG GCT GGT GAT GTG CCT TCG GAA CGC CCA TTC TAT CCC CCA ACC TTG 1152 

VSNLPPASPCAQVEVPWPVV 404

GTC TCC AAC CTG CCC CCA GCC TCC CCA TGT GCC CAG GTG GAG GTG CCG TGG CCT GTG GTC 1212

VASPFRTAKEALLVANGTPR 424

GTG GCC TCC CCC TTC CGC ACA GCC AAG GAG GCA CTG TTG GTG GCC AAC GGG ACG CCC CGC 1272

GGSASVWSERLGQALELGYG 444

GGG GGC AGC GCC AGT GTG TGG AGC GAG AGG CTG GGG CAG GCG CTG GAG CTG GGC TAT GGG 1332 

LQVGTVWINAHGLRDPSVPT 464

CTC CAG GTG GGC ACT GTC TGG ATC AAC GCC CAC GGC CTC AGA GAC CCT TCG GTG CCC ACA 1392 

GGCKESGCSWHGGPDGLYEY 484 

GGC GGC TGC AAG GAG AGT GGG TGT TCC TGG CAC GGG GGC CCA GAC GGG CTG TAT GAG TAT 1452 

LRPSGTPARLSCLSKNLNYD 504

CTG CGG CCC TCA GGG ACC CCT GCC CGG CTG TCC TGC CTC TCC AAG AAC CTG AAC TAT GAC 1512

TFGLAVPSTLPAGPEIGPSP 524 

ACC TTT GGC CTC GCT GTG CCC TCA ACC CTG CCG GCT GGG CCT GAA ATA GGG CCC AGC CCA 1572 

APPYGLFVGGRFQAPGARSS 544 

GCA CCC CCC TAT GGG CTC TTC GTT GGG GGC CGT TTC CAG GCT CCT GGG GCC CGA AGC TCC 1632 

RPIRDSSGNLHGYVAEGGAK 564

AGG CCC ATC CGG GAT TCG TCT GGC AAT CTC CAT GGC TAC GTG GCT GAG GGT GGA GCC AAG 1692

DIRGAVEAAHQAFPGWAGQS 584

GAC ATC CGA GGT GCT GTG GAG GCC GCT CAC CAG GCT TTC CCT GGC TGG GCG GGC CAG TCC 1752 

PGARAALLWALAAALERRKS 604

CCA GGA GCC CGG GCA GCC CTG CTG TGG GCC CTG GCG GCT GCA CTG GAG CGC CGG AAG TCT 1812 




Fig. 1B 



Applicants: Rachel E. Meyers, et al. 

Title: 21481, A NOVEL DEHYDROGENASE MOLECULE AND USES THEREFOR
Attorney/Agent: Kerri Pollard Schray 
Docket No.: MPI00-079P 1 RCP2CN I M 

Sheet 3 of 43 

3/43 



TLASRLERQGAELKAAEAEV 624

ACC CTG GCC TCA AGG CTG GAG AGG CAG GGA GCG GAG CTC AAG GCT GCG GAG GCG GAG GTG 1872

ELSARRLRAWGARVQAQGHT 644

GAG CTG AGC GCA AGA CGA CTT CGG GCG TGG GGG GCC CGG GTG CAG GCC CAA GGC CAC ACC 1932

LQVAGLRGPVLRLREPLGVL 664

CTG CAG GTA GCC GGG CTG AGA GGC CCT GTG CTG CGC CTG CGG GAG CCG CTG GGT GTG CTG 1992

AVVCPDEWPLLAFVSLLAPA 684

GCT GTG GTG TGT CCG GAC GAG TGG CCC CTG CTT GCC TTC GTG TCC CTG CTG GCT CCC GCC 2052

LAYGNTVVMVPSAACPLLAL 704

CTG GCC TAC GGC AAC ACT GTG GTC ATG GTG CCC AGT GCG GCC TGT CCT CTG CTG GCC CTG 2112

EVCQDMATVFPAGLANVVTG 724

GAG GTC TGC CAG GAC ATG GCC ACC GTG TTC CCA GCA GGC CTG GCC AAC GTG GTG ACA GGA 2172

DRDHLTRCLALHQDVQAMWY 744

GAC CGG GAC CAT CTG ACC CGC TGC CTG GCC TTG CAC CAA GAC GTC CAG GCC ATG TGG TAT 2232

FGSAQGSQFVEWASAGNLKP 764

TTC GGA TCA GCC CAG GGT TCC CAG TTT GTC GAG TGG GCC TCG GCA GGA AAC CTC AAA CCG 2292

VWASRGCPRAWDQEAEGAGP 784

GTG TGG GCG AGC AGG GGC TGC CCG CGG GCC TGG GAC CAG GAG GCC GAG GGG GCA GGC CCA 2352

ELGLRVARTKALWLPMGD* 803

GAG CTG GGG CTG CGA GTG GCG CGG ACC AAG GCC CTG TGG CTG CCT ATG GGG GAC TGA 2409



TGCCTGAGCGCCACCTACTGCATTTTGGACACCTCACACCAAGGGGAGATGCACCCCACAGACACCTGGGACTTTCCCC 
TTCTGGTTCCTGTGTCTCC(^TAAACTCTCTGACCMCCCTAAAAAAAAAAAAAAAAAAAAAAAAAARWARMAACTTC 



TGGCAGATATGAGGCTTTTTTCTTTTTTTTT 
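
The translation rows above pair each line of codons with its one-letter amino acid line, so a row can be cross-checked by translating the codons with the standard genetic code. The following is a minimal Python sketch of such a check. The function name `translate_row` is hypothetical, and the sketch assumes that every whitespace-separated token of exactly three unambiguous bases (A, C, G, T) is a codon, so running position numbers and untranslated leader sequence are skipped.

```python
# Standard genetic code, built from the conventional T, C, A, G ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate_row(codon_row: str) -> str:
    """Translate one row of space-separated codons into one-letter residues.

    Tokens that are not exactly three unambiguous bases (for example the
    trailing nucleotide position, or 5' leader sequence) are ignored.
    A stop codon is reported as '*'.
    """
    codons = [
        tok for tok in codon_row.split()
        if len(tok) == 3 and set(tok) <= set("ACGT")
    ]
    return "".join(CODON_TABLE[codon] for codon in codons)


# Last coding row of the 32142 listing, including the TGA stop codon.
last_row = ("GAG CTG GGG CTG CGA GTG GCG CGG ACC AAG GCC "
            "CTG TGG CTG CCT ATG GGG GAC TGA 2409")
print(translate_row(last_row))  # ELGLRVARTKALWLPMGD*
```

Comparing the returned string with the residue line of the same row flags rows where a character was misread in either line.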
Transmembrane Segments Predicted by MEMSAT
Protein Family / Domain Matches, HMMer Version 2
Searching for complete domains in PFAM
hmmpfam - search a single seq against HMM database
HMMER 2.1.1 (Dec 1998)

Copyright (C) 1992-1998 Washington University School of Medicine 
HMMER is freely distributed under the GNU General Public License (GPL) 



HMM file:      /prod/ddm/seqanal/PFAM/pfam5.0/Pfam
Sequence file: /prod/ddm/wspace/orfanal/oa-script.9519.seq



Query: 32142 

Scores for sequence family classification (score includes all domains) : 
Model    Description                    Score   E-value

aldedh   Aldehyde dehydrogenase family  149.8   4.7e-41



Parsed for domains:

Model    Domain  seq-f  seq-t  hmm-f  hmm-t     score  E-value

aldedh   1/1     47     494    1      492   []  149.8  4.7e-41



Alignments of top-scoring domains:

aldedh: domain 1 of 1, from 47 to 494: score 149.8, E = 4.7e-41

* - >ewdsasgkt f evvNPankgevigrvpeataeDvdaAVkAAkeAf ks 
+w +++ + +++ +P + ge +++ +a+aeDv aAV AA+ Afk+ 
32142 47 KWLWPEHRNSVPCQDPIT - GENLASCLQAQAEDVAAAVEAARMAFKG 92 

GpwWakvpaseRariLrkladlieeredeLaaletlDlGKplaeAkgDte 
W++ p Ra+ L +la+ i+ ++ +L le+1 +G ++e+ + + 
32142 93 -- - WS AHPGWRAQHLTRLAEVIQKHQRLLWTLES LVTGRAVREVRDG - D 1 3 8 

vgraideiryyagwarklmgerrvipslatdgdeelnytrrePlGVvgvI 
v+ a + ++y a +a+ t+ e ++ +eP GV+g I 
32142 139 VQLAQQLLHYHAIQAS TQ EEALAGWEPMGVIGLI 172 

sPWNFPlllalwklapALAaGNTVVlKPSEqTPlt. . alllaelieeaGa 
P F +1 ++w ++pALA G+TW + P+++ 111a 1 e G 
32142 172 LPPTFSFLEMMWRICPALAVGCTW ALVPPASpaPLLLAQLAGELG- 218 

nnlPkGVvnwpGfGaevGqaLlshpdidkisFTGSteVGklimeaAAak 
+G +nw G +a+ + L+s+p+i+k++F G +e G+ + ++ A + 
32142 219 - - P F P G I LNWSG - PAS L VP I LAS Q P G I RKVAFCG APE EGRALRRS L AGE 265 

nlkkVtLELGGKsPvIVfdDADLdkAverivfgaFgnaGQvCiApsRllv 
+ L LG s d AD d Ave++v +a G ++ R11 + 
32142 266 -CAELGLALGTESLLLLTDTADVDSAVEGWDAAWSDRG PGGLRLLI 311 

hesiydeFveklkervkklkliGdpldsdtniyGPHseqqfdrvlsyle 
+es+ de + +l+er+ +1+ G +ld + + G+ +++ d v +++ 
32142 312 QESVWDEAMRRLQERMGRLR - SGRGLDGAVDM- GAR - GAAACDLVQRFVR 358 

dgkeeGAkvlcGGerdeskeylggGyyvqPTi f tdVtpdMklmkEEIFGP 
+++++GA+V + G ++ + + ++ PT+++++ p +++++ E+ P 
32142 359 EAQSQGAQVFQAGDVPSE RP FYPPTLVSNLPPASPCAQVEVPWP 402 

VlpiikfkdldEAIelaNdteYGLAayvFTkdilarafrvakaleaGiVw 
V++ f++ EA+ aN t+ G +a+v+++ l a +1++G+Vw 
32142 403 VWASPFRTAKEALLVANGTPRGGSASVWSER - LGQALELGYGLQVGTVW 451 


vNDvcvhaaepqlPFGGvHqSSGiGrehgGkygleeYteiKtVtirl<-*
+N ++ +p++P GG K+ SG + ++ G++gl eY++ + rl
32142 452 IN--AHGLRDPSVPTGGCKE-SGCSWHG-GPDGLYEYLRPSGTPARL 494




Fig. 4 
ProDom Matches

ProdomId  Start  End  Description                                 Score

135       101    770  p99.2 (229) DHAL(10) DHAB(10) DHAM(7) //    280
                      DEHYDROGENASE OXIDOREDUCTASE ALDEHYDE
                      NAD PROTEIN CLASS SEMIALDEHYDE PRECURSOR
                      TRANSIT PEPTIDE

>135 p99.2 (229) DHAL (10) DHAB (10) DHAM(7) // DEHYDROGENASE OXIDOREDUCTASE 
ALDEHYDE NAD PROTEIN CLASS SEMIALDEHYDE PRECURSOR TRANSIT PEPTIDE 
Length = 494 

Score = 280 (103.6 bits), Expect = 7.8e-22, P = 7.8e-22
Identities = 87/289 (30%), Positives = 142/289 (49%) 

Query: 216 ELGPFPGILNWSG--PASLVPILASQPGIRKVAFCGAPEEGRALRRSXXXXXXXXXXXX 273 

E G PG++NW+G A + L S P I K++F G+ E G+A+ ++ 
Sbjct: 194 EAGLP PGVINWTGFGGAEVGEALVSHPDIDKI S FTGSTEVGKAIMKAAAEKNLKPVTLE 253 

Query: 274 XXXXX - - XXXXDTADVDSAVEGWDAAWSDRGP GGLRLL IQESVWDEAMRRLQERMG 328 

D D+D AVE W A+ + G R+ +QES++DE + +L ER+ 

Sbjct: 254 LGGKNPVIVFEDADDLDKAVESVVFGAFFNSGQVCTAASRIFVQESIYDEFVEKLVERVK 313 

Query: 329 RL - RSGRG - - LDGAVDMGAR - GAAACDLVQRFVREAQSQGAQVFQAGD VPSERPFY - 380 

+L + G LD DMG + +Q ++ EA+++GA++ G+ E ++ 

Sbjct: 314 KLLKVGEDDPLDPDTDMGPLINEEQYEKIQSYIEEAKAEGAKLVCGkSERRKAGDEGGYFI 373 

Query: 381 PPTLVSNLPPASPCAQVEVPWPVWASPFRT-AKEALLVANGTPRGGSASWSERLGQAL 439 

PT+++++ Q E+ PV+ F+ EA+ +AN T G +A V++ + >A 

Sbjct: 374 QPTILTDVTEDMRIMQEEIFGPVLPVIKFKDDLDEAIELAN^ 433 

Query: 440 ELGYGLQVGTVWINA HGLRDPSVPTGGCKESGCSWH-GGPDGLYEY 484 

+ L+ GTVW+N H + P GG K+SG GG GL EY 

Sbjct: 434 RVAERLEAGTVWVNDNIYHVSAEAQAPFGGYKQSGIGGREGGKYGLEEY 482 

Score = 262 (97.3 bits), Expect = 8.2e-20, P = 8.2e-20
Identities = 86/301 (28%), Positives = 140/301 (46%)


Query: 101 RAQHLTRLAEVIQKHQRLLWTLESLVTGRAVREVRDGDVQLAQQLLHYHA 150
           RA+ L +LA+++++++ L LE+L TG+ + E + +V A L Y+A
Sbjct:  61 RARILRKLADLLEENKDELAALETLETGKPLAEACTAEVARAVDYLRYYA 120

Query: 151 -IQASTQEE ALAGWEPMGVIGLILPPTFSFLEMMWRICPALAVGVTXX- - -XXXXX 202
           I S E + EP+GV+ I P F + +W+I PALA G T
Sbjct: 121 TIPTSLSESPGSMSYTMREPLGWAAITPWNFPLMMAVWK 180

Query: 203 XXXXXXXXXXXXGELGPFPGILNWSG- -PASLVPILASQPGIRKVAFCGAPEEGRALRR 260
           E G PG++NW+G A + L S P I K++F G+ E G+A+ +
Sbjct: 181 LTALLIJ^LIKEAEAGLPPGVINVVTGFGGAEVGEALVSHPDIDKISFTGST^GKAI^ 240

Query: 261 SXXXXXXXXXXXXXXXXX--XXXXDTADVDSAVEGV^ 315
           ^ A + D D+D AVE W A+ + G R+ +QES+
Sbjct: 241 AAAEKNLKPVTLELGGKNPVIVFEDADDLDKAVESVVF 300

Query: 316 WDEAMRRLQERMGRL - RSGRG - - LDGAVDMGAR - GAAACDLVQRFVREAQSQGAQVFQAG 371
           +DE + +L ER+ +L + G LD DMG + +Q ++ EA+++GA++ G
Sbjct: 301 YDEFVEKLVERVKKLLKVGEDDPLDPDTDMGPLINEEQYEKIQSYIEEAKAEGAKLVCGG 360

Query: 372 D 372
           +
Sbjct: 361 E 361
Score = 219 (82.2 bits), Expect = 4.9e-15, P = 4.9e-15
Identities = 75/236 (31%), Positives = 105/236 (44%) 

Query: 550 S SGNLHGYVAEGG AKD I RGAVEAAHQ AF PG - -WAGQSP -GXXXXXXXXXXXXXERRKSTL 606 

+ +G + V E +D+ AVEAA +AF G W SP E K L 

Sbjct: 20 TNGEVIAQVPEATKEDVDKAVEAAREAFKGGEWGKT3P 79 

Query: 607 AS - -RLERQGXXXXXXXXXXXXXXRRLRAW-GARVQAQGH-TLQVAGLRGP VLRLRE 659 

A+ LE LR + G + G T+ + P +RE 

Sbjct: 80 AALETLETGKPLAEAKVAEVARAVDYLRYYAGMAEKLMGEETIPTSLSESPGSMSYTMRE 139 

Query: 660 PLGVIAWCPDEWPLLAFVSLLAPAI^TGNTVVMVPSAACPLLAL EVCQDMATVFPA 716 

PLGV+A + P +PL+ V +APALA GNTW+ PS PL AL E+ ++ P 
Sbjct: 140 PLGWAAITPWNFPLMMAVWKIAPALAAGNTV 199 

Query: 717 GLANWTG -DRDHLTRCLALHQDVQAMOTFGSAQ -GSQFVEWASAGNLKPVWASRG 770 

G+ NWTG + L D D+ + + GS + G ++ A+ NLKPV G 

Sbjct: 200 GVINVVTGFGGAEVGEALVSHPDIDKISFTGSTEVGKAIMKAAAEKNLKPVTLELG 255 



Fig. 5B
Input file Fbh21481FL.seq; Output File 21481.trans
Sequence length 1379 

TTTGGCCCTCGAGGCCAAGAATTCGGCACGAGGAGCAAGTGGCCTTAACACATGGATTTTCTTCCAAAAATGCAGACCC 

ATTTTAATTAAGTTTGTAATTAACCACTGGGGAGGGCAGGCCCCCTGGATTCGGTCTGCTTTCGGAGACACTGTGAGTA 

ACTTCCTATTTGTTGAACATTTGGGGATTAGCACGCCCACTGGGTGTTCAGCTTGGAGGCTTGCACAGAGCTGAGCTCC 

CTGCAGCCTTGGGCCTCCCCCTGCCCTGGGAGTCCTGATCAGCGTCTCTTTGCAAAGCCAATCCCCTTTTACTCCGTTG 

MGVMAMLMLPLLLLGI 16 
TCCCCCAGAACAAG ATG GGA GTC ATG GCC ATG CTG ATG CTC CCC CTG CTG CTG CTG GGA ATC 48 

SGLLFIYQEVSRLWSKSAVQ 36
AGC GGC CTC CTC TTC ATT TAC CAA GAG GTG TCC AGG CTG TGG TCA AAG TCA GCT GTG CAG 108

NKVVVITDAISGLGKECARV 56
AAC AAA GTG GTG GTG ATC ACC GAT GCC ATC TCA GGA CTG GGC AAG GAG TGT GCT CGG GTG 168

FHTGGARLVLCGKNWERLEN 76
TTC CAC ACA GGT GGG GCA AGG CTG GTG CTG TGT GGA AAG AAC TGG GAG AGG CTA GAG AAC 228

LYDALISVADPSKTFTPKLV 96
CTA TAT GAT GCC TTG ATC AGC GTG GCT GAC CCC AGC AAG ACA TTC ACC CCA AAG CTG GTC 288

LLDLSDISCVPDVAKEVLDC 116
CTG TTG GAC CTC TCA GAC ATC AGC TGT GTC CCA GAT GTG GCA AAA GAA GTC CTG GAT TGC 348

YGCVDILINNASVKVKGPAH 136
TAT GGC TGT GTG GAC ATC CTC ATC AAC AAT GCC AGT GTG AAG GTG AAG GGG CCT GCC CAT 408

KISLELDKKIMDANYFGPIT 156
AAG ATT TCT CTG GAG CTC GAC AAA AAG ATC ATG GAT GCC AAT TAC TTT GGC CCC ATC ACA 468

LTKALLPNMISRRTGQIVLV 176
TTG ACG AAA GCC CTG CTT CCC AAC ATG ATC TCC CGG AGA ACA GGC CAA ATC GTG TTA GTG 528

NNIQGKFGIPFRTTYAASKH 196
AAT AAT ATC CAA GGG AAG TTT GGA ATC CCG TTC CGT ACG ACT TAC GCT GCC TCC AAG CAC 588




Fig. 6A
Fig. 6B
Signal Peptide Predictions for 21481

Method  Predict  Mat@
                 19



Note: amino-terminal 70aa used for signal peptide prediction 



Transmembrane Segments Predicted by MEMSAT

>21481
MGVMAMLMLPLLLLGISGLLFIYQEVSRLWSKSAVQNKVVVITDAISGLGKECARVFHTG
GARLVLCGKNWERLENLYDALISVADPSKTFTPKLVLLDLSDISCVPDVAKEVLDCYGCV
DILINNASVKVKGPAHKISLELDKKIMDANYFGPITLTKALLPNMISRRTGQIVLVNNIQ
GKFGIPFRTTYAASKHAALGFFDCLRAEVEEYDVVISTVSPTFIRSYHVYPEQGNWEASI
WKFFFRKLTYGVHPVEVAEEVMRTVRRKKQEVFMANPIPKAAVYVRTFFPEFFFAVVACG
VKEKLNVPEEG

Transmembrane Segments for Presumed Mature Peptide

Start  End  Orient     Score
265    283  ins-->out  0.2

>21481 mature
LLFIYQEVSRLWSKSAVQNKVVVITDAISGLGKECARVFHTGGARLVLCGKNWERLENLY
DALISVADPSKTFTPKLVLLDLSDISCVPDVAKEVLDCYGCVDILINNASVKVKGPAHKI
SLELDKKIMDANYFGPITLTKALLPNMISRRTGQIVLVNNIQGKFGIPFRTTYAASKHAA
LGFFDCLRAEVEEYDVVISTVSPTFIRSYHVYPEQGNWEASIWKFFFRKLTYGVHPVEVA
EEVMRTVRRKKQEVFMANPIPKAAVYVRTFFPEFFFAVVACGVKEKLNVPEEG
Protein Family / Domain Matches, HMMer Version 2
Searching for complete domains in PFAM
hmmpfam - search a single seq against HMM database
HMMER 2.1.1 (Dec 1998)

Copyright (C) 1992-1998 Washington University School of Medicine
HMMER is freely distributed under the GNU General Public License (GPL).




HMM file:      /prod/ddm/seqanal/PFAM/pfam5.0/Pfam
Sequence file: /prod/ddm/wspace/orfanal/oa-script.9650.seq



Query: 21481 

Scores for sequence family classification (score includes all domains):
Model      Description                   Score   E-value  N

adh_short  short chain dehydrogenase     120.0   4.5e-32  1
A2M        Alpha-2-macroglobulin family  0.5     7.1      1

Parsed for domains:

Model      Domain  seq-f  seq-t  hmm-f  hmm-t     score  E-value

adh_short  1/1     38     227    1      203   []  120.0  4.5e-32
A2M        1/1     278    291    1      14    []  0.5    7.1



Alignments of top-scoring domains: 

adh_short: domain 1 of 1, from 38 to 227: score 120.0, E = 4.5e-32

* - >KvaLvTGassGIGlaiAkrLakeGakVwadrneeklekGavakelk 
Kv+++T a SG+G+++A+ +++ Ga++v+++ n e+le+ ++1 
21481 38 KVVVITDAISGLGKECARVFHTGGARLVLCGKNWERLEN- - LYDALI 82_ 

elGgnd . . kdralaiqlDvtdeesv . aaveqaverlGrlDvLVNNAGgii 
+++++ + lD++d + V+++++++++ +G +D+L+NNA + 

21481 83 SV- ADPskTFTPKLVLLDLSDISCVpDVAKEVLDCYGCVDILINNAS - - V 129^ 

1 lrpgpf ael s r tmeedwdrvidvNl tgvf 11 1 ravlplmamkkrggGr I 
gp++++s +e+ ++++d N++g++ lt+a+lp m+ r+ G I 
21481 130 -KVKGPAHKIS LELDKKIMDANYFGPITLTKALLP--NMISRRTGQI 173 

vNiSSvaGrkegglvgvpggsaYsASKaAvigltrsLAlElaphglrVna 
v + + G + g p+++ Y+ASK+A g+ ++L+ E+ ++ + ++ 
21481 174 VLVNNIQG- KFGIPFRTTYAASKHAALGFFDCLRAEVEEYDWIST 218 


VAPGgvdTd<-*
v+P +++
21481 219 VSPTFIRSY 227



A2M: domain 1 of 1, from 278 to 291: score 0.5, E = 7.1 

* - >ideddi tiRSyFPE< - * 
i+ + +R++FPE 
21481 278 IPKAAVYVRTFFPE 291 



Fig. 9 
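
Each alignment in the hmmpfam output above is introduced by a one-line summary of the form `model: domain i of n, from a to b: score s, E = e`. The following Python sketch shows one way such summary lines could be parsed into structured records. The names `DomainHit` and `parse_domain_line` are hypothetical, and the regular expression is an assumption based only on the lines shown in these figures, not on a specification of the hmmpfam format.

```python
import re
from typing import NamedTuple, Optional


class DomainHit(NamedTuple):
    model: str      # HMM name, e.g. "A2M"
    index: int      # domain number within the query
    total: int      # number of domains found for this model
    seq_from: int   # first query residue of the match
    seq_to: int     # last query residue of the match
    score: float    # domain score as printed
    evalue: float   # domain E-value


# Pattern inferred from the summary lines in this listing (an assumption).
DOMAIN_LINE = re.compile(
    r"^(?P<model>\S+):\s+domain\s+(?P<index>\d+)\s+of\s+(?P<total>\d+),"
    r"\s+from\s+(?P<start>\d+)\s+to\s+(?P<end>\d+):"
    r"\s+score\s+(?P<score>-?\d+(?:\.\d+)?),\s+E\s*=\s*(?P<evalue>\S+)\s*$"
)


def parse_domain_line(line: str) -> Optional[DomainHit]:
    """Return a DomainHit for a domain summary line, or None if no match."""
    m = DOMAIN_LINE.match(line.strip())
    if m is None:
        return None
    return DomainHit(
        model=m.group("model"),
        index=int(m.group("index")),
        total=int(m.group("total")),
        seq_from=int(m.group("start")),
        seq_to=int(m.group("end")),
        score=float(m.group("score")),
        evalue=float(m.group("evalue")),
    )


hit = parse_domain_line(
    "A2M: domain 1 of 1, from 278 to 291: score 0.5, E = 7.1"
)
print(hit.model, hit.seq_to - hit.seq_from + 1)  # A2M 14
```

The span length computed in the last line (14 residues) agrees with the 14-residue A2M alignment shown for query 21481.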
ProDom Matches

ProdomId  Start  End  Description                                 Score

11        99     219  p99.2 (1078) ADH(34) GALE(20) FABG(13) //   113
                      OXIDOREDUCTASE PROTEIN DEHYDROGENASE
                      NAD REDUCTASE NADP BIOSYNTHESIS
                      SYNTHASE ALCOHOL PUTATIVE


>11 p99.2 (1078) ADH(34) GALE (20) FABG (13) // OXIDOREDUCTASE PROTEIN 

DEHYDROGENASE NAD REDUCTASE NADP BIOSYNTHESIS SYNTHASE ALCOHOL PUTATIVE 
Length = 269 



Score = 113 (44.8 bits), Expect = 0.00016, P = 0.00016
Identities = 41/138 (29%), Positives = 63/138 (45%)



Query: 99 DLSDIS - CVPDVAKEVIjDCYGCVDILINNASVKV-- KGPAHKISLELD - RXIMDANY 151 

D+ D+ V V +E +G +D+L+NNA V K A ++ E +++++ N 

Sbjct: 87 DVEDVEKLVETVVEEFSGIHGKIDVLVNNAGVMAPKAVAESOT 146 

Query: 152 FGPITLTKALLPNMIS RRTGQI VLVNNIQGK - FGI P - FRTTYAASKHAALGF 201 

G LT+A LP M R G IV V ++ G G P + Y+ASK A F 

Sbjct: 147 TGTEmTQAALPAMKKFSDAAAKKRFVGTIVNVASVAGSTMGSPGSQAAYSASKAAVESF 206 

Query: 202 FDCLRAEVEEYDWISTV 219 

L E+ Y ++ V 
Sbjct: 207 TKSLAMELSPYSASVAMV 224 



Fig. 10 
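
The ProDom match statistics report identities and positives both as a fraction of the alignment length and as a whole-number percentage. The percentages printed in these listings (for example 41/138 as 29% and 63/138 as 45%) are consistent with truncation toward zero rather than rounding to the nearest integer. The Python sketch below illustrates that arithmetic; the helper name `percent_truncated` is hypothetical, and the truncation rule is inferred from the printed values only.

```python
def percent_truncated(matches: int, length: int) -> int:
    """Whole-number percentage of matches over length, truncated (not rounded)."""
    if length <= 0:
        raise ValueError("alignment length must be positive")
    return 100 * matches // length


# Values from the 21481 ProDom match: Identities 41/138, Positives 63/138.
identities = percent_truncated(41, 138)
positives = percent_truncated(63, 138)
print(identities, positives)  # 29 45
```

Rounding to the nearest integer would instead give 30% and 46% for the same fractions, which is why the truncating form is used here.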
Input file Fbh25964Fl.seq; Output File 25964.trans
Sequence length 1725 

GAGAAGGAGGAGCCAGCGGAAGGACGGTGTGCGGGCCGGCCAGCCCTGGACGAAAGAAGAGGGCCCCTCCAGGCCAGTC 
TGGGCACCCTGGGATAGCGGCTGCAG 

GCTCTGCCCTCCCCGCAAACGCCAGCCTCGTCACCGCTCCAGGGCACCTCCAGCAGTAACAGGTGGTTGCAGCAGGTGG 

MADSAQAQK 9 

CAGCCAGCCCCTGGATGAGCCAAGGTCTCTTCCCCAGCCAGGC ATG GCC GAC TCT GCA CAG GCC CAG AAG 27 

LVYLVTGGCGFLGEHVVRML 29 

CTG GTG TAC CTG GTC ACA GGG GGC TGT GGC TTC CTG GGA GAG CAC GTG GTG CGA ATG CTG 87 

LQREPRLGELRVFDQHLGPW 49 

CTG CAG CGG GAG CCC CGG CTC GGG GAG CTG CGG GTC TTT GAC CAA CAC CTG GGT CCC TGG 147 

LEELKTGPVRVTAIQGDVTQ 69

CTG GAG GAG CTG AAG ACA GGG CCT GTG AGG GTG ACT GCC ATC CAG GGG GAC GTG ACC CAG 207

AHEVAAAVAGAHVVIHTAGL 89

GCC CAT GAG GTG GCA GCA GCT GTG GCC GGA GCC CAT GTG GTC ATC CAC ACG GCT GGG CTG 267

VDVFGRASPKTIHEVNVQGT 109

GTA GAC GTG TTT GGC AGG GCC AGT CCC AAG ACC ATC CAT GAG GTC AAC GTG CAG GGT ACC 327

RNVIEACVQTGTRFLVYTSS 129

CGG AAC GTG ATC GAG GCT TGT GTG CAG ACC GGA ACA CGG TTC CTG GTC TAC ACC AGC AGC 387

MEVVGPNTKGHPFYRGNEDT 149

ATG GAA GTT GTG GGG CCT AAC ACC AAA GGT CAC CCC TTC TAC AGG GGC AAC GAA GAC ACC 447

PYEAVHRHPYPCSKALAEWL 169

CCA TAC GAA GCA GTG CAC AGG CAC CCC TAT CCT TGC AGC AAG GCC CTG GCC GAG TGG CTG 507

VLEANGRKVRGGLPLVTCAL 189

GTC CTG GAG GCC AAC GGG AGG AAG GTC CGT GGG GGG CTG CCC CTG GTG ACG TGT GCC CTT 567

RPTGIYGEGHQIMRDFYRQG 209

CGT CCC ACG GGC ATC TAC GGT GAA GGC CAC CAG ATC ATG AGG GAC TTC TAC CGC CAG GGC 627

LRLGGWLFRAIPASVEHGRV 229 

CTG CGC CTG GGA GGT TGG CTC TTC CGG GCC ATC CCG GCC TCT GTG GAG CAT GGC CGG GTC 687 



Fig. 11A
Protein Family / Domain Matches, HMMer Version 2
Searching for complete domains in PFAM
hmmpfam - search a single seq against HMM database
HMMER 2.1.1 (Dec 1998)

Copyright (C) 1992-1998 Washington University School of Medicine
HMMER is freely distributed under the GNU General Public License (GPL)



HMM file:      /prod/ddm/seqanal/PFAM/pfam5.0/Pfam
Sequence file: /prod/ddm/wspace/orfanal/oa-script.9289.seq



Query: 25964 

Scores for sequence family classification (score includes all domains):
Model          Description                              Score   E-value  N

3Beta_HSD      3-beta hydroxysteroid dehydrogenase/iso  676.9   1e-199   1
S-AdoMet_synt  S-adenosylmethionine synthetase          1.8     0.78     1
adh_short      short chain dehydrogenase                -48.6   0.022    1
Epimerase      NAD dependent epimerase/dehydratase fam  -148.0  0.0016   1

Parsed for domains:

Model          Domain  seq-f  seq-t      hmm-f  hmm-t      score   E-value

adh_short      1/1     10     197        1      203    []  -48.6   0.022
S-AdoMet_synt  1/1     341    351        365    376    .]  1.8     0.78
3Beta_HSD      1/1     1      365    [.  1      425    []  676.9   1e-199
Epimerase      1/1     12     365        1      359    []  -148.0  0.0016




Alignments of top-scoring domains: 

adh_short: domain 1 of 1, from 10 to 197: score -48.6, E = 0.022

*->KvaLvTGassGIGlaiAkrLakeGakVvvadrneeklekGavakelk
v LvTG+++ +G +++ L+ + ++ ++ + G +++elk
25964 10 LVYLVTGGCGFLGEHVVRMLLQR--EPRLGELRVFDQHLGPWLEELK 54

elGgndkdralaiqlDvtdeesv.aaveqaverlGrlDvLVNNAGgiill
+ r+ aiq+Dvt++ +v aav+ a +v++ AG +
25964 55 TGPV RVTAIQGDVTQAHEVaAAVAGA HVVIHTAG--L-- 89

rpgpfaelsrtmeedwdrvidvNltgvflltravlplmamkkrggGrlvN
+ f + s ++ +++vN+ g tr v++ a ++ g v
25964 90 -VDVFGRASPK---TIHEVNVQG TRNVIE--ACVQTGTRFLVY 126

iSSvaGrke g.glvgvpggsaYsASKaAvigltrs
+SS +e ++++++++ +++ + + ++ +Y +SKa 1++
25964 127 TSS---MEvvgpntkghpfyrgnEdTPYEAVHRHPYPCSKA LAEW 168

LAlElaphglr VnavaPGgvdTd<-*
L IE +++r++ + a P g++ +
25964 169 LVLEANGRKVRgglplvTCALRPTGIYGE 197

S-AdoMet_synt: domain 1 of 1, from 341 to 351: score 1.8, E = 0.78

*->HFGreevdFpWE<-*
HFG e F+WE
25964 341 HFGYEP-LFSWE 351



Fig. 14A




3Beta_HSD: domain 1 of 1, from 1 to 365: score 676.9, E = 1e-199

*->elsesldmaglsclVTGGgGFlGrhIVreLlregeslqevRvfDlrf
+++S++ l++lVTGG+GFlG+h Vr+Ll+++++1 e+RvfD +
25964 1 -MADSAQAQKLVYLVTGGCGFLGEHVVRMLLQREPRLGELRVFDQHL 46




_. ..... _ spelde.dssklqvitkikyieGDv.tDkqdlaa^ 

+p+l+e +++++ v+ +i+GDvt+++++aaA++g+ 
2 5964 47 GPWLEE 1KTGP VRVT AIQGDVTQAHEVAAAVAGA 80 

dvvIHtAaiiDvfGelrvsGSDLSFGVTVLFLAVTEGSYVVFYmGATDLR 
+wIHtA+++DvfG 

25964 81 HWIHTAGLVDVFG 94 

kasrdrimkVNVkGTqnvldACveaGVrvlVYTSSmeWGpNsrGqpivN 
as+ +i++VNV+GT+nv++ACv++G+r+lVYTSSmeWGpN +G+p+++ 
25964 95 RASPKTIHEVNVQGTRNVIEACVQTGTRFLVYTSSMEWGPNTKGHPFYR 144 

GdEttpYestDDhqdaYpeSKalAEklVLkANGsmlknGgrLyTCALRPa 
G+E+tpYe++ h+++Yp+SKalAE 1VL+ANG+ +++G L+TCALRP+ 
25964 145 GNEDTPYEAV- -HRHPYPCSKALAEWLVLEANGRKVRGGLPLVTCALRPT 192 

glfGeGdqflvpflrqlvknGlakfriGdknalsdrVYVgNVAwAHILAA 
gl+GeG q + + f+rq +++G+ +fr ++ + rVYVgNCAw+H+LAA 
25964 193 GIYGEGHQIMRDFYRQGLRIiGGWLFRAIPASVE 242 

raLqdpkkGREGassiaGqaYFIsDdsPvnSYddFnrtllkalGlrlpst 
r+L+++ a+ + Gq+YF++D+sP++SY+dFn+++l ++Glrl + 

25964 243 RELEQR AALMGGQVYFGYDGSPYRSYEDFNMEFLGPGGLRLVGA 286 

w.rlPlpllyvlaylnellswLLrklalrYtPllnpytvtlanttFtfst 
++lP++ll++la+ln+ll+wLLr+l + Y Pllnpyt+++anttFt+st 
25964 287 RpLLPYWLLVFLAALNALLQWLLRPL-VLYAPLLNPYTLAVANTTFTVST 335 

nKAkkdLGYePlvtwEEarakTieWiqele< - * 
+KA++++GYePl++wE +r +Ti+W+q+ 
25964 336 DKAQRH FGYE P LF S WED S RTRT I LWVQAAT 365 
Epimerase: domain 1 of 1, from 12 to 365: score -148.0, E = 0.0016

*->ILVTGGAGFIGShlvreLlnn. . . ygddkVwLDnLtdyYqyagnea 
+LVTGG GF G h+vr Li+ +++ g +V + + + 

25964 12 YLVTGGCGFLGEHWRMLLQReprLGELRV FD QHLGPW 49 

_ r — — --*l^v a *egapryt^ 

++e + g r+t ++GD+ + + + +a +ViH A++ V +r 

25964 50 LEELKTGPVRVTAIQGDVTQAHEVAAAVAGA- -HWIHTAGLVDVf GR- - 95 

ekPlayidtNwGTltLLEaaRnYWsaLdetkagvkkfvfsSTdeVYGdl 
P + + Nv GT + +Ea+ g +v+ S+ eV G + 

25964 96 AS P KT I HEVNVQGTRNVI EACV QTGTRFLVYTSSMEWGPN 136 

esiPisaF. . . tEdtPynPs. . SPYgaSKassEllvrayhraygLpaiiL 
++ + F ++ EdtPy ++ PY SKa E lv + 
25964 137 TKGHP - - FyrgNEDTP YEAVhrHPYPCSKALAEWLVLEAN 174 

RyFNvYGpyqsgriGedpngfpekLIPliiqnalgkgeplpvYGdDYpTp 
G+ g+ +P1+ + al p +YG 
T 25964 175 GRKVRGG LPLV-TCALR- - -PTGIYG 196 

DGtqv.RDw . ihVeDharANhllaltkg 

+G q+ RD+ +++ + ++ + + + ++++++ v ++a h+la +++ 
25964 197 EGHQImRDFyrqglrlggwlf raipasvehgrVYVGNVAWM-HVLAAREL 245 

• raGkgsevYNiGg 

+++ +++ ++++ ++++++ ++ + + +G 
25964 246 eqraalmggqvyf cydgspyrsyedfnmef lgpcglrLVG 285 

gneysnlEvVealekllgelaPekphvkakedpatfvddRpGddarya . . 
+ + + + +++++1+ l + ++ +++ ++++a 

25964 286 - ARPLLP YWLLVFLAALNALLQWL LRPLVDTAPLLN- - PYTLAva 327 



25964 



aDasKikreLGWkPevtnleeGladTvnWylene< - * 

- +++ +++ +K++I G++P + e+ +T+ W + 
328 n 1 1 f t VS TDKAQ RHFG YE P L F S - WED S RTRT I LWVQ AAT 365 
ProdomId  Start  End  Description                                          Score

1280      11     362  p99.2 (39) 3BHS(5) 3BH1(4) 3BH2(3) // DEHYDROGENASE  395
                      STEROID BETA-HYDROXYSTEROID 3BETA-HSD
                      DEHYDROGENASE/DELTA 5-->4-ISOMERASE INCLUDES:
                      PROGESTERONE 3-BETA-HYDROXY-DELTA5-STEROID
                      3-BETA-HYDROXY-5-ENE


>1280 p99.2 (39) 3BHS(5) 3BH1(4) 3BH2(3) // DEHYDROGENASE STEROID

BETA-HYDROXYSTEROID 3BETA-HSD DEHYDROGENASE/DELTA 5-->4-ISOMERASE
INCLUDES: PROGESTERONE 3-BETA-HYDROXY-DELTA5-STEROID 3-BETA-HYDROXY-5-ENE
Length = 416

Score = 395 (144.1 bits), Expect = 3.2e-42, Sum P(2) = 3.2e-42
Identities = 99/268 (36%), Positives = 134/268 (50%)



Query: 102
           ++ NVQGTRN+IE C RF MEV GPN+ G+E+ +E+ +PYP
Sbjct: 157

Query: 162
           SK +AE VL ANG ++ G L TCALRP IYGEG + + Q L+ GG +FR
Sbjct: 209

Query: 221
           VYVGNVAW H+ + GQ Y+ D +P++SY+D N
Sbjct: 269

Query: 279
           GLRL ++ LP YW N + + ++NTTFT S
Sbjct: 329

Query: 335
           KAQR GYEPL SWE+++ +T W++
Sbjct: 386

Score = 65 (….9 bits), Expect = 3.2e-42, Sum P(2) = 3.2e-42
Identities = 11/23 (47%), Positives = 17/23 (73%) 

Query: 11 VYLVTGGCGFLGEHVVRMLLQRE 33 

VY VTGG FLG ++V++L+ + 
Sbjct: 14 VYAVTGGAEFLGRYIVKLLISAD 36 
Input file Fbh21686Fl.seq; Output File 21686.trans
Sequence length 1209 

M S L R 4 

CCCACGCGTCCGCCCACGCGTCCGCGGACGCGTGGGCGGACGCGTGGGCGCCCGCCTCGA ATG TCC CTG AGA 12 

PRRACAQLLWHPAAGMASWA 24

CCC AGA AGG GCC TGC GCT CAG CTG CTC TGG CAC CCC GCT GCA GGG ATG GCC TCC TGG GCT 72

KGRSYLAPGLLQGQVAIVTG 44

AAG GGC AGG AGC TAC CTG GCG CCT GGT TTG CTG CAG GGC CAA GTG GCC ATC GTC ACC GGC 132

GATGIGKAIVKELLELGSNV 64

GGG GCC ACG GGC ATC GGA AAA GCC ATC GTG AAG GAG CTC CTG GAG CTG GGG AGT AAT GTG 192

VIASRKLERLKSAADELQAN 84

GTC ATT GCA TCC CGT AAG TTG GAG AGA TTG AAG TCT GCG GCA GAT GAA CTG CAG GCC AAC 252

LPPTKQARVIPIQCNIRNEE 104

CTA CCT CCC ACA AAG CAG GCA CGA GTC ATT CCC ATA CAA TGC AAC ATC CGG AAT GAG GAG 312

EVNNLVKSTLDTFGKINFLV 124

GAG GTG AAT AAT TTG GTC AAA TCT ACC TTA GAT ACT TTT GGT AAG ATC AAT TTC TTG GTG 372

NNGGGQFLSPAEHISSKGWH 144

AAC AAT GGA GGA GGC CAG TTT CTT TCC CCT GCT GAA CAC ATC AGT TCT AAG GGA TGG CAC 432

AVLETNLTGTFYMCKAVYSS 164

GCT GTG CTT GAG ACC AAC CTG ACG GGT ACC TTC TAC ATG TGC AAA GCA GTT TAC AGC TCC 492

WMKEHGGSIVNIIVPTKAGF 184

TGG ATG AAA GAG CAT GGA GGA TCT ATC GTC AAT ATC ATT GTC CCT ACT AAA GCT GGA TTT 552

PLAVHSGAARAGVYNLTKSL 204

CCA TTA GCT GTG CAT TCT GGA GCT GCA AGA GCA GGT GTT TAC AAC CTC ACC AAA TCT TTA 612

ALEWACSGIRINCVAPGVIY 224

GCT TTG GAA TGG GCC TGC AGT GGA ATA CGG ATC AAT TGT GTT GCC CCT GGA GTT ATT TAT 672

SQTAVENYGSWGQSFFEGSF 244

TCC CAG ACT GCT GTG GAG AAC TAT GGT TCC TGG GGA CAA AGC TTC TTT GAA GGG TCT TTT 732

QKIPAKRIGVPEEVSSVVCF 264

CAG AAA ATC CCC GCT AAA CGA ATT GGT GTT CCT GAG GAG GTC TCC TCT GTG GTC TGC TTC 792

LLSPAASFITGQSVDVDGGR 284

CTA CTG TCT CCT GCA GCT TCC TTC ATC ACT GGA CAG TCG GTG GAT GTG GAT GGG GGC CGG 852

SLYTHSYEVPDHDNWPKGAG 304

AGT CTC TAT ACT CAC TCG TAT GAG GTA CCA GAT CAT GAC AAC TGG CCC AAG GGA GCA GGG 912

DLSVVKKMKETLKEKAKL* 323

GAC CTT TCT GTT GTC AAA AAG ATG AAG GAG ACC TTA AAG GAG AAA GCT AAG CTC TGA 969

GCTGAGGAAACMGGTGTCCTCCATCCCCACTGOT 

<5TTGGTATGGAAAA(ZATTTTTCTTATTTTTMGTGTTATTAATTATATCTATGGAAAAACTAT 




GTCTTATGTCCCAAAAAAAAAA 
CLUSTAL W (1.74) multiple sequence alignment



" 5t) 52204 jSDRjfST — - - - - -^SWSGQS YLAA^ 

21686 MSLRPRRACAQLLWHPAAGMASWAKGRS YLAPGLLQGQVAIVTGGATGIGKAIVKELLEL 

*.** 4 * .**«* .**** # *** ; ************ .**« * 

5052204_SDR_rat GCNVVIASRKLDRLTAAVDELRASQPPSSSTQVTAIQCNIRKEEEVNNLVKSTLAKYGKI

21686 GSNVVIASRKLERLKSAADELQANLPPTKQARVIPIQCNIRNEEEVNNLVKSTLDTFGKI

*.*********.**. .*.***. * # **;..::* .******;************ , : ***

5052204_SDR_rat NFLVNNAGGQFMAPAEDITAKGWQAVIETNLTGTFYMCKAVYNSWMKDHGGSIVNIIVLL

21686 NFLVNNGGGQFLSPAEHISSKGWHAVLETNLTGTFYMCKAVYSSWMKEHGGSIVNIIVPT

****** # **** ..****.. ***.**.*******************.**** ******

5052204_SDR_rat NNGFPTAAHSGAARAGVYNLTKTMALTWASSGVRINCVAPGTIYSQTAVDNYGELGQTMF
21686 KAGFPLAVHSGAARAGVYNLTKSLALEWACSGIRINCVAPGVIYSQTAVENYGSWGQSFF

; *** *.**************. .** **,**;***************.***_ **..*

5052204_SDR_rat EMAFENIPAKRVGLPEEISPLVCFLLSPAASFITGQLINVDGGQALYTRNFTIPDHDNWP

21686 EGSFQKIPAKRIGVPEEVSSVVCFLLSPAASFITGQSVDVDGGRSLYTHSYEVPDHDNWP

* .*..*****. *.***.* # .**************.* ..****..***.. .*******

5052204_SDR_rat VGAGDSSFIKKVKESLKKQARL

21686 KGAGDLSVVKKMKETLKEKAKL

**** *•**.**.**..*.*
Signal Peptide Predictions for 21686

Method               Predict  Score  Mat@
SignalP (eukaryote)  MAYBE           20



Note: amino-terminal 70aa used for signal peptide prediction 
Transmembrane Segments Predicted by MEMSAT

Start  End  Orient     Score
29     50   ins-->out  0.9
170    188  out-->ins  0.2
208    224  ins-->out  0.6
258    275  out-->ins  2.6



>21686 

MSLRPRRACAQLLWHPAAGMASWAKGRSYLAPGLLQGQVAIVTGGATGIGKAIVKELLEL
GSNVVIASRKLERLKSAADELQANLPPTKQARVIPIQCNIRNEEEVNNLVKSTLDTFGKI
NFLVNNGGGQFLSPAEHISSKGWHAVLETNLTGTFYMCKAVYSSWMKEHGGSIVNIIVPT
KAGFPLAVHSGAARAGVYNLTKSLALEWACSGIRINCVAPGVIYSQTAVENYGSWGQSFF
EGSFQKIPAKRIGVPEEVSSVVCFLLSPAASFITGQSVDVDGGRSLYTHSYEVPDHDNWP
KGAGDLSVVKKMKETLKEKAKL



Transmembrane Segments for Presumed Mature Peptide

Start  End  Orient     Score
10     31   ins-->out  0.9
151    169  out-->ins  0.2
189    205  ins-->out  0.6
239    256  out-->ins  2.6

>21686_mature
MASWAKGRSYLAPGLLQGQVAIVTGGATGIGKAIVKELLELGSNVVIASRKLERLKSAAD
ELQANLPPTKQARVIPIQCNIRNEEEVNNLVKSTLDTFGKINFLVNNGGGQFLSPAEHIS
SKGWHAVLETNLTGTFYMCKAVYSSWMKEHGGSIVNIIVPTKAGFPLAVHSGAARAGVYN
LTKSLALEWACSGIRINCVAPGVIYSQTAVENYGSWGQSFFEGSFQKIPAKRIGVPEEVS
SVVCFLLSPAASFITGQSVDVDGGRSLYTHSYEVPDHDNWPKGAGDLSVVKKMKETLKEK
AKL



Fig. 19 
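
The two MEMSAT tables for 21686 list the same predicted segments in two coordinate systems: full-length protein numbering and presumed mature peptide numbering. The mature coordinates are the full-length coordinates shifted by the number of residues removed ahead of the first mature residue, which the signal peptide table gives as position 20. The Python sketch below illustrates that shift; the function name `to_mature_coords` is hypothetical, and the sketch assumes the listed mature start position is the first retained residue.

```python
from typing import List, Tuple

Segment = Tuple[int, int]  # (start, end), 1-based inclusive residue numbers


def to_mature_coords(segments: List[Segment], mature_start: int) -> List[Segment]:
    """Shift full-length coordinates to mature-peptide numbering.

    mature_start is the full-length position of the first mature residue,
    so (mature_start - 1) leading residues are removed.
    """
    offset = mature_start - 1
    shifted = []
    for start, end in segments:
        if start <= offset:
            raise ValueError("segment overlaps the removed leader sequence")
        shifted.append((start - offset, end - offset))
    return shifted


# Full-length transmembrane segments predicted for 21686.
full_length = [(29, 50), (170, 188), (208, 224), (258, 275)]
print(to_mature_coords(full_length, mature_start=20))
# [(10, 31), (151, 169), (189, 205), (239, 256)]
```

The shifted values reproduce the segment boundaries listed for the presumed mature peptide.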






Protein Family / Domain Matches, HMMer Version 2 

Searching for complete domains in PFAM 

hmmpfam - search a single seq against HMM database 

HMMER 2.1.1 (Dec 1998)

Copyright (C) 1992-1998 Washington University School of Medicine 
HMMER is freely distributed under the GNU General Public License (GPL) . 



HMM file:      /prod/ddm/seqanal/PFAM/pfam4.4/Pfam
Sequence file: /prod/ddm/wspace/orfanal/oa-script.19160.seq



Query: 21686 

Scores for sequence family classification (score includes all domains) : 
Model         Description                               Score  E-value  N
adh_short     short chain dehydrogenase                 162.5  7.3e-45  1
adh_short_C2  short chain dehydrogenase/reductase C-te   47.2  3.7e-10  1



Parsed for domains: 

Model         Domain  seq-f  seq-t  hmm-f  hmm-t     score  E-value
adh_short     1/1        38    226      1    203 []  162.5  7.3e-45
adh_short_C2  1/1       250    280      1     31 []   47.2  3.7e-10



Alignments of top-scoring domains: 

adh_short: domain 1 of 1, from 38 to 226: score 162.5, E = 7.3e-45 

          *->KvaLvTGassGIGlaiAkrLakeGakVvvadrneeklekGavakelk
             +va+vTG++ GIG+ai+k+L++ G +Vv+a r e+l ++++
21686  38    QVAIVTGGATGIGKAIVKELLELGSNVVIASRKLERL KSAAD 79

             elGgnd....kdralaiqlDvtdeesv.aaveqavorlGrlDvLVNNAGg
             el +n+++++ r+++iq++++ ee+v+++v+ ++ +G+++ LVNN Gg
21686  80    ELQANLpptkQARVIPIQCNIRNEEEVnNLVKSTLDTFGKINFLVNNGGG 129

             .iillrpgpfaelsrtmeedwdrvidvNltgvflltravlplmamkkrgg
             +++ p++ +s + w +v+++Nltg+f++++av +k +g
21686 130    qFL SPAEHIS SKGWHAVLETNLTGTFYMCKAVYS--SWMKEHG 170

             GrIvNiSSvaGrkegglvgvpggsaYsASKaAvigltrsLAlElaphglr
             G+IvNi + g+p ++ +A+ a+v lt+sLAlE+a glr
21686 171    GSIVNIIV-PT -KAGFPLAVHSGAARAGVYNLTKSLALEWACSGIR 214

             VnavaPGgvdTdo
             +n+vaPG ++ +
21686 215    INCVAPGVIYSQ 226


adh_short_C2: domain 1 of 1, from 250 to 280: score 47.2, E = 3.7e-10 

          *->gRlGePeEiAnavvFLASdaAsYiTGqtlvV<-*
             +R G PeE++++v FL S+aAs+iTGq + V
21686 250    KRIGVPEEVSSVVCFLLSPAASFITGQSVDV 280

ProDom Matches 



ProDom Id  Start  End  Score  Description
121622        29   82     70  p99.2 (1) YS05_CAEEL // HYPOTHETICAL 98.0 KD PROTEIN F56D1.5 IN CHROMOSOME II TRANSMEMBRANE
95301         35   82     86  p99.2 (1) O27957_ARCFU // SHIKIMATE 5-DEHYDROGENASE AROE HYPOTHETICAL PROTEIN
11            37  231    157  p99.2 (1078) ADH(34) GALE(20) FABG(13) // OXIDOREDUCTASE PROTEIN DEHYDROGENASE NAD REDUCTASE NADP BIOSYNTHESIS SYNTHASE ALCOHOL PUTATIVE
73753        237  286     84  p99.2 (1) P71079_BACSU // UNIDENTIFIED DEHYDROGENASE
77223        243  287     92  p99.2 (1) O07882_STAXY // GLUCOSE-1-DEHYDROGENASE

>11 p99.2 (1078) ADH(34) GALE(20) FABG(13) // OXIDOREDUCTASE PROTEIN 
DEHYDROGENASE NAD REDUCTASE NADP BIOSYNTHESIS SYNTHASE ALCOHOL PUTATIVE 
Length = 269 

Score = 157 (60.3 bits), Expect = 1.2e-09, P = 1.2e-09 
Identities = 64/213 (30%), Positives = 106/213 (49%) 

Query: 51 KAIVKELLELGSNVVIASRKLERLKSAADELQANLPPTKQA RVIPIQCNIRNEEEVN 107 

K +V S AS+ E ++A+TQA V + C++ + E+V 

Sbjct: 35 KWWSATSEESESTEASK- - ESAMEVSKAVNAEVSATMQAVGVTVTKVTCDVADVEDVE 92 

Query: 108 NLVKSTLDTF— -GKINFLVNNGGGQFLSP AEHISSKG WHAVLETNLTGTF 155 

LV++ + + F _GKI+ LVNN G ++P AE ++ + W V+E N+TGTF 

Sbjct: 93 KL VETWEE FSG~I HGK I D VLVNNAG - - VMAPKAVAE SMT E ET S DDE EWE EVI EVNVTGT F 150 

Query: 156 YMCKAVYS S WMK EHGGSIVNI - - IVPTKAGFP - - LAVHSGAARAGVYNLTKS 203 

+ +A + K G+IVN+ + + G P A +S A++A V + TKS 

Sbjct: 151 NLTQAALPAMKKFSDAAAKKRFVGTIVNVASVAGSTMGSPGSQAAYS-ASKAAVESFTKS 209 

Query: 204 LALE WACSG- -IRINCVAPGVIYSQTAVEN 231 

LA+E ++ S +R+N VAPG + + A+E+ 
Sbjct: 210 LAMELSPYSASVAMVRVNAVAPGYVETD-ALES 241 

Score = 103 (41.3 bits); Expect = 0.0021, Sum P(2) = 0.0021 
Identities = 32/100 (32%), Positives = 54/100 (54%) 



Query:  37 GQVAIVTGGA--TGIGKAIVKELLELGSNVVIASRKLERLKS--AADE LQAN 84
           G+ +VTGG+ +GIG AI ++L E G+ VV+ S E +S A+ E + A
Sbjct:   7 GKTVLVTGGSGFSGIGIAIARQLAEEGAKVVWSATSEES 66

Query:  85 LPPTKQA---RVIPIQCNIRNEEEVNNLVKSTLDTFGKIN 121
           +  T QA    V  + C++ + E+V  LV++ ++ F  I+
Sbjct:  67 VSATMQAVGVTVTKVTCDVADVEDVEKLVETVVEEFSGIH 106



Score = 37 (18.1 bits), Expect = 0.0021, Sum P(2) = 0.0021 
Identities = 9/23 (39%), Positives = 13/23 (56%) 

Query: 205 ALEWACSGIRINCVAPGVIYSQT 227
           ALE A +G+ +  V PG +   T
Sbjct: 238 ALESATNGLSVVTVRPGNVRVNT 260

>77223 p99.2 (1) O07882_STAXY // GLUCOSE-1-DEHYDROGENASE 
Length = 67 

Score = 92 (37.4 bits), Expect = 0.00031, P = 0.00031 
Identities = 19/45 (42%), Positives = 29/45 (64%) 

Query: 243 SFQKIPAKRIGVPEEVSSVVCFLLSPAASFITGQSVDVDGGRSLY 287
           + + IPAK IG  ++V++V  FL S  A +I G ++ VDGG + Y
Sbjct:  15 TLEMIPAKEIGFADQVANVARFLCSDLADYIHGTTIYVDGGMTNY 59



>95301 p99.2 (1) O27957_ARCFU // SHIKIMATE 5-DEHYDROGENASE AROE HYPOTHETICAL PROTEIN 
Length = 108 

Score = 86 (35.3 bits), Expect = 0.0014, P = 0.0014 
Identities = 20/48 (41%), Positives = 31/48 (64%) 

Query:  35 LQGQVAIVTGGATGIGKAIVKELLELGSNVVIASRKLERLKSAADELQ 82
           L G+ A+V G A G GKA    LL++GS V++A+R  E+ + A + L+
Sbjct:  10 LGGKTALVVG-AGGAGKAAALALLDMGSTVIVANRTEEKGREAVEMLR 56



>73753 p99.2 (1) P71079_BACSU // UNIDENTIFIED DEHYDROGENASE 
Length = 60 

Score = 84 (34.6 bits), Expect = 0.0023, P = 0.0023 
Identities = 20/50 (40%), Positives = 29/50 (58%) 

Query: 237 QSFFEGSFQKIPAKRIGVPEEVSSVVCFLLSPAASFITGQSVDVDGGRSL 286
           +   E + Q  PA R+   +++   V FL+S  A  I GQ++ VDGGRSL
Sbjct:   9 EDLLEDARQNTPAGRMVEIKDMVDTVEFLVSSKADMIRGQTIIVDGGRSL 58



>121622 p99.2 (1) YS05_CAEEL // HYPOTHETICAL 98.0 KD PROTEIN F56D1.5 IN 
CHROMOSOME II TRANSMEMBRANE 
Length = 194 

Score = 70 (29.7 bits), Expect = 7.6, P = 1.0 
Identities = 20/57 (35%), Positives = 29/57 (50%) 

Query:  29 YLAPGLLQGQV--AIVTGGATGIGKAIVKELLELG-SNVVIASRKLERLKSAADELQ 82
           +  P L Q Q    +V+GG  GIGKA   EL + G    V+  R  ++L S   E++
Sbjct:  62 FYKPNLEQYQHRWTVVSGGTDGIGKAYTLELAKRGLRKFVLIGRNPKKLDSVKSEIE 118



Fig. 21B 

GGAATGGATGCTGTTGGCTTAAACCTCCCCCTGCCCTGGGGGTTGCAACCAGGGTCTCTG 
CAAAGCCAATCCTTJ!GXCATCCCGCTGTCCTGCAGAGCAAGATGGGGCTCATGGCTGTGC 
TGATGCTACCCCTGCTGCTGCTGGGAATCAGCGGCCTCCTCTTCATTTACCAGGAGGCAT 
CCAGGCTGTGGTCGAAGTCTGCCGTGCAGAACAAAGTGGTGGTCATCACAGATGCCATCT 
CAGGACTGGGAAAGGAGTGTGCTCGGGTGTTCCATGCAGGTGGGGCAAGGCTGGTGCTGT 
GTGGAAAGAACTGGGAGGGACTGGAGAGCCTCTATGCCACCTTGACCAGTGTGGCTGACC 
CCAGCAAGACATTCACCCCCAAGCTGGTCCTCCTGGATCTCTCAGACATTAGCTGTGTTC 
AAGATGTGGCCAAAGAGGTCCTGGACTGCTACGGCTGTGTGGACATCCTCATCAACAATG 
CCAGCGTGAAAGTGAAGGGGCCTGCCCACAAGATTTCCCTGGAGCTTGACAAAAAGATCA 
TGGATGCCAACTACTTCGGACCCATCACTTTAACCAAAGTTCTGCTTCCCAACATGATCT 
CCAGGAGAACAGGCCAGATTGTGTTAGTGAACAACATCCAAGCGAAGTTTGGAATCCCGT 
TCCGCACAGCTTATGCAGCCTCTAAGCATGCCGTCATGGGCTTCTTTGACTGCCTCCGAG 
CCGAGGTTGAGGAATACGATGTTGTGGTCAGCACCGTGAGCCCAACTTTCATCCGCTCCT 
ACCGTGCTTCCCCTGAGCAAAGAAACTGGGAGACATCCATTTGTAAATTCTTCTGCAGGA 
AGCTAGCCTATGGCGTGCACCCGGTGGAGGTGGCTGAGGAAGTGATGCGCACAGTACGGA 
GGAAGAAGCAAGAGGTGTTCATGGCCAACCCGGTTCCTAAGGCTGCCGTGTTCATCCGCA 
CCTTCTTCCCTGAGTTCTTCTTCGCTGTGGTGGCCTGTGGGGTGAAGGAGAAGCTCAATG 
TCCCAGAAGAGGGTTAACCTCGTGGCCAAAGGGGTCACTCAAGGGGAATAAAGGCTTTHH 
TAGAGAAAAAAAAAAAAAAAAAAAAAAA 




Fig. 31A 

MGLMAVLMLPLLLLGISGLLFIYQEASRLWSKSAVQNKVVVITDAISGLGKECARVFHAG 
GARLVLCGKNWEGLESLYATLTSVADPSKTFTPKLVLLDLSDISCVQDVAKEVLDCYGCV 
DILINNASVKVKGPAHKISLELDKKIMDANYFGPITLTKVLLPNMISRRTGQIVLVNNIQ 
AKFGIPFRTAYAASKHAVMGFFDCLRAEVEEYDVVVSTVSPTFIRSYRASPEQRNWETSI 
CKFFCRKLAYGVHPVEVAEEVMRTVRRKKQEVFMANPVPKAAVFIRTFFPEFFFAVVACG 
VKEKLNVPEEG. 



Fig. 31B 
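
The amino acid sequence of Fig. 31B corresponds to a translation of the open reading frame in Fig. 31A. As a rough check, the Python sketch below translates a short stretch from the start of that reading frame using the standard genetic code. The helper names (`CODON_TABLE`, `translate`) are illustrative only, and the choice of where the reading frame begins (the first ATG preceding GGGCTC) is an assumption made for this example.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, with codons enumerated in TCAG order
# (TTT, TTC, TTA, TTG, TCT, ...); "*" marks a stop codon.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna):
    """Translate from the first base until a stop codon or the end of the input."""
    dna = dna.upper()
    protein = []
    for i in range(0, len(dna) - 2, 3):
        aa = CODON_TABLE[dna[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


# First 42 bases of the reading frame, beginning at the assumed start codon.
orf_start = "ATGGGGCTCATGGCTGTGCTGATGCTACCCCTGCTGCTGCTG"
print(translate(orf_start))  # MGLMAVLMLPLLLL
```

The same function applied to the final bases of the reading frame (`GAAGAGGGTTAA...`) stops at the TAA codon, which matches the terminal `EEG.` of the protein sequence.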




GAP of: FrGcgManager_31_UFAHDJyG_ check: 516 from: 1 to: 936 
m21481 ORF - Import - vector trimmed 

to: FrGcgManager_31_VFAOzr_19 check: 2871 from: 1 to: 933 
h21481 ORF - Import - vector trimmed 

Symbol comparison table: /ddm_local/gcg/gcg_9.1/gcgcore/data/rundata/nwsgapdna.cmp 
CompCheck: 8760 

Gap Weight: 12               Average Match: 10.000 
Length Weight: 4             Average Mismatch: 0.000 

Quality: 8220                Length: 936 
Ratio: 8.810                 Gaps: 0 
Percent Similarity: 88.103   Percent Identity: 88.103 

Match display thresholds for the alignment(s): 
| = IDENTITY 
: = 5 
. = 1 

FrGcgManager_31_UFAHDJyG_ x FrGcgManager_31_VFA0zr_19 

   1 ATGGGGCTCATGGCTGTGCTGATGCTACCCCTGCTGCTGCTGGGAATCAG 50
     |||||  |||||||  |||||||||| |||||||||||||||||||||||
   1 atgggagtcatggccatgctgatgctccccctgctgctgctgggaatcag 50

  51 CGGCCTCCTCTTCATTTACCAGGAGGCATCCAGGCTGTGGTCGAAGTCTG 100
     ||||||||||||||||||||| ||||  |||||||||||||| ||||| |
  51 cggcctcctcttcatttaccaagaggtgtccaggctgtggtcaaagtcag 100

 101 CCGTGCAGAACAAAGTGGTGGTCATCACAGATGCCATCTCAGGACTGGGA 150
     | |||||||||||||||||||| ||||| |||||||||||||||||||| 
 101 ctgtgcagaacaaagtggtggtgatcaccgatgccatctcaggactgggc 150

 151 AAGGAGTGTGCTCGGGTGTTCCATGCAGGTGGGGCAAGGCTGGTGCTGTG 200
     |||||||||||||||||||||||  |||||||||||||||||||||||||
 151 aaggagtgtgctcgggtgttccacacaggtggggcaaggctggtgctgtg 200

 201 TGGAAAGAACTGGGAGGGACTGGAGAGCCTCTATGCCACCTTGACCAGTG 250
     |||||||||||||||| | || |||| ||| ||||   |||||| ||| |
 201 tggaaagaactgggagaggctagagaacctatatgatgccttgatcagcg 250

 251 TGGCTGACCCCAGCAAGACATTCACCCCCAAGCTGGTCCTCCTGGATCTC 300
     |||||||||||||||||||||||||||| |||||||||||  |||| |||
 251 tggctgaccccagcaagacattcaccccaaagctggtcctgttggacctc 300



Fig. 32A 

 301 TCAGACATTAGCTGTGTTCAAGATGTGGCCAAAGAGGTCCTGGACTGCTA 350
     |||||||| |||||||| | ||||||||| ||||| |||||||| |||||
 301 tcagacatcagctgtgtcccagatgtggcaaaagaagtcctggattgcta 350

 351 CGGCTGTGTGGACATCCTCATCAACAATGCCAGCGTGAAAGTGAAGGGGC 400
      |||||||||||||||||||||||||||||||| ||||| ||||||||||
 351 tggctgtgtggacatcctcatcaacaatgccagtgtgaaggtgaaggggc 400

 401 CTGCCCACAAGATTTCCCTGGAGCTTGACAAAAAGATCATGGATGCCAAC 450
     ||||||| |||||||| |||||||| ||||||||||||||||||||||| 
 401 ctgcccataagatttctctggagctcgacaaaaagatcatggatgccaat 450

 451 TACTTCGGACCCATCACTTTAACCAAAGTTCTGCTTCCCAACATGATCTC 500
     ||||| || |||||||| || || ||||  ||||||||||||||||||||
 451 tactttggccccatcacattgacgaaagccctgcttcccaacatgatctc 500

 501 CAGGAGAACAGGCCAGATTGTGTTAGTGAACAACATCCAAGCGAAGTTTG 550
     | ||||||||||||| || ||||||||||| || ||||||| ||||||||
 501 ccggagaacaggccaaatcgtgttagtgaataatatccaagggaagtttg 550

 551 GAATCCCGTTCCGCACAGCTTATGCAGCCTCTAAGCATGCCGTCATGGGC 600
     ||||||||||||| ||  |||| || ||||| ||||| || | | |||||
 551 gaatcccgttccgtacgacttacgctgcctccaagcacgcagccctgggc 600

 601 TTCTTTGACTGCCTCCGAGCCGAGGTTGAGGAATACGATGTTGTGGTCAG 650
     ||||||||||||||||||||||| || |||||||||||||||||  ||||
 601 ttctttgactgcctccgagccgaagtggaggaatacgatgttgtcatcag 650

 651 CACCGTGAGCCCAACTTTCATCCGCTCCTACCGTGCTTCCCCTGAGCAAA 700
     |||||||||||| ||||||||||| || ||||  |  |  || |||||| 
 651 caccgtgagcccgactttcatccggtcgtaccacgtgtatccagagcaag 700

 701 GAAACTGGGAGACATCCATTTGTAAATTCTTCTGCAGGAAGCTAGCCTAT 750
     ||||||||||  | |||||||| |||||||| | |||||||||  |||| 
 701 gaaactgggaagcttccatttggaaattctttttcaggaagctgacctac 750

 751 GGCGTGCACCCGGTGGAGGTGGCTGAGGAAGTGATGCGCACAGTACGGAG 800
     ||||||||||| || |||||||| ||||| ||||||||||| || |||||
 751 ggcgtgcacccagtagaggtggcggaggaggtgatgcgcaccgtgcggag 800

 801 GAAGAAGCAAGAGGTGTTCATGGCCAACCCGGTTCCTAAGGCTGCCGTGT 850
     |||||||||||||||||| |||||||||||  | || ||||| |||||||
 801 gaagaagcaagaggtgtttatggccaaccccatccccaaggccgccgtgt 850

 851 TCATCCGCACCTTCTTCCCTGAGTTCTTCTTCGCTGTGGTGGCCTGTGGG 900
      | |||||||||||||||| |||||||| ||||| |||||||||||||||
 851 acgtccgcaccttcttcccggagttctttttcgccgtggtggcctgtggg 900

 901 GTGAAGGAGAAGCTCAATGTCCCAGAAGAGGGTTAA 936
     ||||||||||||||||||||||| || ||||| 
 901 gtgaaggagaagctcaatgtcccggaggagggg 933

Fig. 32B 

GAP of: FrGcgManager_32_ZFA004eiD check: 657 from: 1 to: 311 
m21481 aa - Import - complete 

to: FrGcgManager_32_AGAjaPna_ check: 9949 from: 1 to: 311 
h21481 aa - Import - complete 

Symbol comparison table: /prod/ddm/seqanal/BLAST/matrix/aa/BLOSUM62 
CompCheck: 1102 
Matrix made by matblas from blosum62.iij 

Gap Weight: 12               Average Match: 2.778 
Length Weight: 4             Average Mismatch: -2.248 

Quality: 1467                Length: 311 
Ratio: 4.717                 Gaps: 0 
Percent Similarity: 92.926   Percent Identity: 91.318 

Match display thresholds for the alignment(s): 
| = IDENTITY 
: = 2 
. = 1 

FrGcgManager_32_ZFA004eiD x FrGcgManager_32_AGAjaPna_ 

   1 MGLMAVLMLPLLLLGISGLLFIYQEASRLWSKSAVQNKVVVITDAISGLG 50
     ||.||.||||||||||||||||||| ||||||||||||||||||||||||
   1 MGVMAMLMLPLLLLGISGLLFIYQEVSRLWSKSAVQNKVVVITDAISGLG 50

  51 KECARVFHAGGARLVLCGKNWEGLESLYATLTSVADPSKTFTPKLVLLDL 100
     |||||||| ||||||||||||| ||.||  | ||||||||||||||||||
  51 KECARVFHTGGARLVLCGKNWERLENLYDALISVADPSKTFTPKLVLLDL 100

 101 SDISCVQDVAKEVLDCYGCVDILINNASVKVKGPAHKISLELDKKIMDAN 150
     |||||| |||||||||||||||||||||||||||||||||||||||||||
 101 SDISCVPDVAKEVLDCYGCVDILINNASVKVKGPAHKISLELDKKIMDAN 150

 151 YFGPITLTKVLLPNMISRRTGQIVLVNNIQAKFGIPFRTAYAASKHAVMG 200
     ||||||||| |||||||||||||||||||| |||||||| ||||||| :|
 151 YFGPITLTKALLPNMISRRTGQIVLVNNIQGKFGIPFRTTYAASKHAALG 200

 201 FFDCLRAEVEEYDVVVSTVSPTFIRSYRASPEQRNWETSICKFFCRKLAY 250
     |||||||||||||||:|||||||||||   ||| ||| || ||| ||| |
 201 FFDCLRAEVEEYDVVISTVSPTFIRSYHVYPEQGNWEASIWKFFFRKLTY 250

 251 GVHPVEVAEEVMRTVRRKKQEVFMANPVPKAAVFIRTFFPEFFFAVVACG 300
     |||||||||||||||||||||||||||:|||||::|||||||||||||||
 251 GVHPVEVAEEVMRTVRRKKQEVFMANPIPKAAVYVRTFFPEFFFAVVACG 300

 301 VKEKLNVPEEG 311
     |||||||||||
 301 VKEKLNVPEEG 311

Fig. 33 
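
The summary values in the two GAP reports (Figs. 32 and 33) appear to follow from simple ratios: Ratio equals Quality divided by the length of the shorter sequence, and the percentages equal a column count divided by that same length. The Python sketch below reproduces the reported figures under that reading. The function names are illustrative, and the column counts (822 identical nucleotide columns; 284 identical and 289 similar amino acid columns) are assumptions back-calculated from the reported values rather than numbers printed in the figures.

```python
def gap_ratio(quality, shorter_length):
    """Quality divided by the length of the shorter of the two sequences."""
    return quality / shorter_length


def percent(count, shorter_length):
    """Share of aligned columns, as a percentage of the shorter sequence."""
    return 100.0 * count / shorter_length


# Nucleotide alignment: Quality 8220, sequences of 936 and 933 bases.
print(f"{gap_ratio(8220, 933):.3f}")   # reported Ratio: 8.810
# Assumption: 822 identical columns (Quality 8220 / Average Match 10).
print(f"{percent(822, 933):.3f}")      # reported Percent Identity: 88.103

# Protein alignment: Quality 1467, both sequences 311 residues.
print(f"{gap_ratio(1467, 311):.3f}")   # reported Ratio: 4.717
# Assumption: 284 identical and 289 similar columns.
print(f"{percent(284, 311):.3f}")      # reported Percent Identity: 91.318
print(f"{percent(289, 311):.3f}")      # reported Percent Similarity: 92.926
```

Because the nucleotide scoring table gives 10 for a match and 0 for a mismatch, similarity and identity coincide there, whereas the BLOSUM62-based protein comparison also counts conservative substitutions as similar.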



